Farsi Searching and Display Technologies
Authors
Abstract
In this paper, we report on our ongoing research toward a Unicode-based search engine for Farsi. The work comprises an I/O subsystem, a Farsi stemmer, the preparation of a test collection, and the search engine itself. The engine is intended to be independent of the operating system platform and to require no special hardware or software. We further plan to tune the system for other languages written in Arabic-related scripts.
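As an illustration of one issue a Unicode-based Farsi engine typically has to handle, the sketch below normalizes text before indexing and querying. It is not taken from the paper: the function name `normalize_farsi` and the particular mapping choices (folding Arabic Yeh and Kaf into their Persian forms, dropping tatweel, the zero-width non-joiner, and combining marks) are assumptions made for this example.

```python
import unicodedata

# Hypothetical mapping table, chosen for illustration only.
_CHAR_MAP = {
    "\u064A": "\u06CC",  # ARABIC LETTER YEH  -> ARABIC LETTER FARSI YEH
    "\u0643": "\u06A9",  # ARABIC LETTER KAF  -> ARABIC LETTER KEHEH
    "\u0640": None,      # ARABIC TATWEEL (kashida): purely decorative, dropped
    "\u200C": None,      # ZERO WIDTH NON-JOINER: dropped so spelling variants match
}
_TABLE = str.maketrans(_CHAR_MAP)


def normalize_farsi(text: str) -> str:
    """Return a search-friendly form of a Farsi string (illustrative sketch)."""
    # Compose characters first so precomposed letters such as U+0622 survive.
    text = unicodedata.normalize("NFC", text)
    # Fold Arabic code points into Persian ones and drop formatting characters.
    text = text.translate(_TABLE)
    # Remove remaining combining marks (category Mn), e.g. short-vowel diacritics.
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


if __name__ == "__main__":
    arabic_kaf_word = "\u0643\u062A\u0627\u0628"   # "book" typed with Arabic Kaf
    persian_kaf_word = "\u06A9\u062A\u0627\u0628"  # "book" typed with Persian Keheh
    print(normalize_farsi(arabic_kaf_word) == persian_kaf_word)
```

Applying the same function to both documents and queries lets visually identical words match even when they were typed with different code points. Whether to drop the zero-width non-joiner is a design choice; an engine that needs to preserve word-internal boundaries might map it to a space instead.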
Similar resources
A word spotting method for Farsi machine-printed document images
In this paper, a word spotting approach for Farsi printed document images has been presented. The main idea of the paper is the font recognition of Farsi document images and query word modification according to the document image’s font before searching. This operation increases the similarity between the query word image and its instances in the document image; therefore, the performance of th...
A Semantic Approach to Person Profile Extraction from Farsi Documents
Entity profiling (EP), an important task in Web mining and information extraction (IE), is the process of extracting the entities in question and their related information from given text resources. From a computational viewpoint, the Farsi language is one of the less-studied and less-resourced languages, and suffers from the lack of high-quality language processing tools. This problem emphasizes th...
The Problems of Desktop Indexing of a Book Translated into a Non-Roman Script: Description of a Real Experience
Zarnegar (gold writer) is a word processor widely used by publishers of both scholarly journals and books in Iran. Although it is gradually being replaced by Word for Windows, which is much more powerful than Zarnegar, the process seems to be slow, and most Iranian publishers still prefer to receive manuscripts in Zarnegar rather than Word. There are many reasons for this preference: Word, though having man...
Selection of single-chain variable fragments specific for Mycobacterium tuberculosis ESAT-6 antigen using ribosome display
Objective(s): Tuberculosis (TB) is still one of the problematic infectious diseases in developing countries, especially in Iran. In the present study, we applied ribosome display technique to select single chain variable fragments (scFvs) specific for the 6-kDa early secretory antigenic target (ESAT-6) antigen of Mycobacterium tuberculosis from a mouse scFv library. Materials and Methods: The g...
Designing a Distributed search engine for Farsi/English web pages
In this paper, we have tried to model, design, and test a prototype of a Farsi/English search engine. The engine must cope with features of the web as a medium, such as heterogeneity, volatility, and a huge amount of unstructured worldwide information. These features, as well as the rapid advance in technology, challenge the effectiveness of classical Information Retrieval (IR) techniques. Although a g...